Evaluation of Musical Feature Extraction Tools Using Perceptual Ratings
Abstract
Swedish title: Utvärdering av verktyg för insamling av musikaliska egenskaper med hjälp av perceptuella bedömningar
The increasing availability of digital music has created a demand for ways to organize and retrieve it, and a new multi-disciplinary research area, music information retrieval (MIR), has emerged in response. An important part of the content-based branch of MIR is the extraction of musical features, such as tempo or modality, directly from the content, i.e. the audio. This thesis evaluates available tools for musical feature extraction. The evaluation uses the extracted musical features as predictors of perceptual ratings in correlation and regression analyses. The 11 perceptual ratings were gathered in a listening test. 22 musical audio features were extracted from the same stimuli using different feature extraction systems. The features were chosen, based on findings in the literature, to predict some of the perceptual ratings. High inter-subject reliability in the listening test implied strong agreement among the subjects, indicating that 20 subjects were sufficient. Fairly low inter-correlations between the ratings indicated that they were rated independently of each other. Six out of seven perceptual ratings with a priori selected predictors correlated moderately with their corresponding predictor (r > 0.6). The results of the stepwise regression analyses were also moderate: the proportion of variance explained by the predictors ranged from 29% to 76%, indicating that there is room for improvement in feature extraction algorithms.
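The kind of analysis the abstract describes can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the number of stimuli (100) and the 1 to 9 rating scale are invented, and Cronbach's alpha is used as the reliability measure although the abstract does not name one. The regression is a plain least-squares fit with a fixed predictor, not the stepwise procedure used in the thesis. Only the figure of 20 subjects comes from the abstract.

```python
import numpy as np


def cronbach_alpha(ratings):
    """Cronbach's alpha for ratings of shape (n_stimuli, n_subjects).

    Subjects are treated as the 'items' whose agreement is measured.
    """
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]
    sum_of_subject_vars = ratings.var(axis=0, ddof=1).sum()
    var_of_summed_ratings = ratings.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_of_subject_vars / var_of_summed_ratings)


def pearson_r(x, y):
    """Pearson correlation between a feature and a mean rating."""
    return float(np.corrcoef(x, y)[0, 1])


def r_squared(features, target):
    """Proportion of variance in target explained by an ordinary
    least-squares fit on features, shape (n_stimuli, n_features)."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    design = np.column_stack([np.ones(len(target)), features])  # intercept
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((target - target.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (hypothetical, not from the thesis).
rng = np.random.default_rng(0)
n_stimuli, n_subjects = 100, 20            # 20 subjects as in the abstract
tempo = rng.uniform(60, 180, n_stimuli)    # hypothetical extracted feature (BPM)
perceived_speed = (tempo - 60) / 120 * 8 + 1   # hypothetical 1 to 9 scale
ratings = perceived_speed[:, None] + rng.normal(0, 1.0, (n_stimuli, n_subjects))

mean_rating = ratings.mean(axis=1)         # one perceptual rating per stimulus
alpha = cronbach_alpha(ratings)
r = pearson_r(tempo, mean_rating)
r2 = r_squared(tempo[:, None], mean_rating)

print(f"Cronbach's alpha: {alpha:.3f}")
print(f"Pearson r (tempo vs. mean rating): {r:.3f}")
print(f"Variance explained (R^2): {r2:.3f}")
```

With a single predictor, the explained variance equals the squared Pearson correlation, which makes it easy to check the two functions against each other. With several candidate features, `r_squared` accepts a matrix with one column per feature.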
Similar resources
Image authentication using LBP-based perceptual image hashing
Feature extraction is a main step in all perceptual image hashing schemes, in which robust features lead to better perceptual robustness. Simplicity, discriminative power, computational efficiency and robustness to illumination changes are regarded as distinguishing properties of Local Binary Pattern features. In this paper, we investigate the use of local binary patterns for percep...
Musical Instrument Family Classification
A method to classify sounds of musical instruments (single monophonic notes) is introduced. The classification uses two sets of perceptual features, extracted from the sounds, in parallel. The models used are mixtures of Gaussians, whose parameters were found by training over a database of target sound families. The feature extraction procedure, model training and model usage for class...
Design and Optimization of Neuro-Fuzzy-Based Recognition of Musical Rhythm Patterns
The task of recognizing patterns and assigning rhythmic structure to unquantized musical input is fundamental for interactive musical systems and for searching musical databases, since melody is based on rhythm. We use a combination of combinatorial pattern matching and structural interpretation with a match quality rating by a neuro-fuzzy system that incorporates musical knowledge and ope...
Evaluation of Audio Beat Tracking and Music Tempo Extraction Algorithms
This is an extended analysis of eight different algorithms for musical tempo extraction and beat tracking. The algorithms participated in the 2006 Music Information Retrieval Evaluation eXchange (MIREX), where they were evaluated using a set of 140 musical excerpts, each with beats annotated by 40 different listeners. Performance metrics were constructed to measure the algorithms’ abilities to ...
Predicting the perception of performed dynamics in music audio with ensemble learning.
By varying the dynamics in a musical performance, the musician can convey structure and different expressions. Spectral properties of most musical instruments change in a complex way with the performed dynamics, but dedicated audio features for modeling the parameter are lacking. In this study, feature extraction methods were developed to capture relevant attributes related to spectral characte...